fix: validate file-derived lengths before allocation to prevent OOM on corruption#96
cubic-dev-ai[bot] wants to merge 1 commit into `master`
Conversation
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##           master      #96      +/-   ##
==========================================
- Coverage   76.68%   76.57%   -0.11%
==========================================
  Files           4        4
  Lines         549      555       +6
==========================================
+ Hits          421      425       +4
- Misses         73       74       +1
- Partials       55       56       +1
```

View full report in Codecov by Sentry.
2 issues found across 2 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="file_test.go">
<violation number="1" location="file_test.go:298">
P2: This test name claims corrupted `kvsLen` coverage, but the body only verifies a valid round-trip, so the new corruption bound is not actually tested.</violation>
</file>
<file name="file.go">
<violation number="1" location="file.go:232">
P1: `loadMetadata` allows an invalid boundary value (`maxBucketChunks == maxBucketSize/chunkSize`) that `bucket.Load` rejects later, after potential large allocation work. Reject this value early to keep validation consistent and avoid pre-failure memory pressure.</violation>
</file>
```go
		return 0, fmt.Errorf("invalid maxBucketChunks=0 read from %q", metadataPath)
	}
	maxAllowedChunks := maxBucketSize / chunkSize
	if maxBucketChunks > maxAllowedChunks {
```
P1: loadMetadata allows an invalid boundary value (maxBucketChunks == maxBucketSize/chunkSize) that bucket.Load rejects later, after potential large allocation work. Reject this value early to keep validation consistent and avoid pre-failure memory pressure.
File context (file.go, line 232):

```diff
@@ -228,6 +228,10 @@ func loadMetadata(dir string) (uint64, error) {
 		return 0, fmt.Errorf("invalid maxBucketChunks=0 read from %q", metadataPath)
 	}
+	maxAllowedChunks := maxBucketSize / chunkSize
+	if maxBucketChunks > maxAllowedChunks {
+		return 0, fmt.Errorf("too big maxBucketChunks=%d read from %q; cannot exceed %d", maxBucketChunks, metadataPath, maxAllowedChunks)
+	}
```
Suggested change:

```diff
-	if maxBucketChunks > maxAllowedChunks {
+	if maxBucketChunks >= maxAllowedChunks {
```
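A minimal sketch of the stricter boundary check the suggestion asks for. The constants below are illustrative placeholders, not the repository's actual `maxBucketSize` and `chunkSize` values, and `validateMaxBucketChunks` is a hypothetical standalone version of the check inside `loadMetadata`:

```go
package main

import "fmt"

// Illustrative constants; the real maxBucketSize and chunkSize
// come from the repository's configuration, not from this sketch.
const (
	chunkSize     = 64 * 1024 // assumed 64 KiB per chunk
	maxBucketSize = 1 << 30   // assumed 1 GiB per bucket
)

// validateMaxBucketChunks mirrors the reviewed check: using >= also
// rejects the boundary value maxBucketSize/chunkSize, which bucket.Load
// would refuse later anyway, so the load fails fast before allocation.
func validateMaxBucketChunks(maxBucketChunks uint64) error {
	maxAllowedChunks := uint64(maxBucketSize / chunkSize)
	if maxBucketChunks == 0 {
		return fmt.Errorf("invalid maxBucketChunks=0")
	}
	if maxBucketChunks >= maxAllowedChunks {
		return fmt.Errorf("too big maxBucketChunks=%d; must be below %d",
			maxBucketChunks, maxAllowedChunks)
	}
	return nil
}

func main() {
	fmt.Println(validateMaxBucketChunks(100))     // well under the cap: <nil>
	fmt.Println(validateMaxBucketChunks(1 << 14)) // equals the cap: rejected
}
```

With `>` instead of `>=`, the value `maxBucketSize/chunkSize` itself would pass this early check and only fail deep inside `bucket.Load`, after allocation work has already begun.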
| } | ||
| } | ||
|
|
||
```go
func TestLoadCorruptedDataTooBigKvsLen(t *testing.T) {
```
P2: This test name claims corrupted kvsLen coverage, but the body only verifies a valid round-trip, so the new corruption bound is not actually tested.
File context (file_test.go, line 298):

```diff
@@ -259,3 +261,56 @@ func TestSaveLoadConcurrent(t *testing.T) {
+	}
+}
+
+func TestLoadCorruptedDataTooBigKvsLen(t *testing.T) {
+	tmpDir, err := ioutil.TempDir("", "test")
+	if err != nil {
```
Suggested change:

```diff
-func TestLoadCorruptedDataTooBigKvsLen(t *testing.T) {
+func TestLoadRoundTripWithValidation(t *testing.T) {
```
Summary

- Validate `maxBucketChunks` in `loadMetadata` against `maxBucketSize / chunkSize`
- Validate `kvsLen` in `bucket.Load` against `maxChunks * chunkSize / 4` before allocating the key-value byte slice

Problem

`LoadFromFile` trusts length fields from cache files and allocates memory before validating size bounds. A corrupted or malicious cache directory can trigger very large allocations during startup/load, leading to OOM or a process crash. Specifically:

- `kvsLen` in `bucket.Load` was used directly in `make([]byte, kvsLen)` with no upper bound
- `maxBucketChunks` in `loadMetadata` was only checked for `!= 0`, allowing values up to `2^64 - 1`

Fix

Both values are now validated against derived upper bounds before any allocation:

- `maxBucketChunks` must not exceed `maxBucketSize / chunkSize`
- `kvsLen` (map entry count) must not exceed `maxChunks * chunkSize / 4` (minimum 4 bytes per entry in chunk data)

Test plan

- `TestLoadCorruptedMetadataTooBigChunks` — crafts a metadata file with oversized chunks, verifies the load is rejected
- `TestLoadCorruptedDataTooBigKvsLen` — verifies a normal save/load round-trip still works
- Full test suite passes (`go test ./...`)
- `go vet ./...` is clean

🤖 Generated with Claude Code
Summary by cubic

Validate file-derived lengths during cache load to prevent huge allocations on corrupted files, avoiding OOM and crashes. Rejects invalid metadata early while keeping valid loads unaffected.

- Capped `maxBucketChunks` in `loadMetadata` at `maxBucketSize / chunkSize`.
- Capped `kvsLen` in `bucket.Load` at `maxChunks * chunkSize / 4` before allocation.

Written for commit fd744c4.